firecoperana
Collaborator

The following changes are included in this PR:

  1. Change the default value of the reasoning format to auto and make the current WebUI compatible with this change (Support streaming delta.reasoning_content in WebUI, ggml-org/llama.cpp#15052). This should be fine with most third-party front ends, as they should have been updated by now. If yours has not, start the server with --reasoning-format none.

  2. Add a setting for the reasoning format to the current WebUI. If you run into parsing issues with reasoning content, switch between none and auto to see which one fixes them.

  3. server : include usage statistics only when user request them (ggml-org/llama.cpp#16052). This could be a breaking change, but it is compatible with the OpenAI API.

  4. server : only attempt to enable thinking if using jinja (ggml-org/llama.cpp#15967)
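
For third-party front ends affected by items 1 and 3, here is a minimal client-side sketch, not code from this PR. It assumes the server streams OpenAI-style `data: {...}` lines, that with the auto reasoning format the reasoning text arrives in `delta.reasoning_content` (as the title of the referenced upstream PR suggests), and that usage statistics are requested through the OpenAI-style `stream_options.include_usage` field. The helper names `build_chat_request` and `collect_stream` are hypothetical.

```python
import json


def build_chat_request(messages, want_usage=True):
    """Build a streaming chat-completions payload (hypothetical helper)."""
    payload = {"messages": messages, "stream": True}
    if want_usage:
        # Assumption: usage is opt-in via the OpenAI-style stream_options field.
        payload["stream_options"] = {"include_usage": True}
    return payload


def collect_stream(sse_lines):
    """Split streamed chunks into reasoning text, answer text, and usage.

    sse_lines is an iterable of already-decoded lines such as
    'data: {"choices": [...]}'. Lines without the 'data: ' prefix are skipped.
    """
    reasoning, content, usage = [], [], None
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        # The usage block, if requested, is expected on a late chunk.
        if chunk.get("usage"):
            usage = chunk["usage"]
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            # Assumption: reasoning is delivered separately from the answer
            # when the reasoning format is auto.
            reasoning.append(delta.get("reasoning_content") or "")
            content.append(delta.get("content") or "")
    return "".join(reasoning), "".join(content), usage
```

A front end that never sets `want_usage` should be prepared for `usage` to stay `None` after this change, and one that does not read `reasoning_content` may need the server started with --reasoning-format none.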

server : include usage statistics only when user request them (#16052)
server : only attempt to enable thinking if using jinja (#15967)
@firecoperana firecoperana self-assigned this Sep 24, 2025
@ikawrakow ikawrakow merged commit 09db3a4 into main Sep 24, 2025